Make copy/graft/prune work with unevenly distributed rows#5807
Merged
mangas
reviewed
Feb 10, 2025
///
/// The word 'ogive' is somewhat obscure, but has a lot fewer letters than
/// 'piecewise linear function'. Copolit also claims that it is also a lot
/// more fun to say.
ff607b0 to
b19b6c1
mangas
approved these changes
Feb 11, 2025
b19b6c1 to
157c291
When we copy/graft/prune, we split the work that needs to be done into batches that are each meant to take roughly three minutes, to avoid bloating the `subgraph_deployment` table. Pruning causes a very serious problem with that, and when that happens it can cripple the performance of the overall system.

The code that adjusts the batch size to hit that target tacitly assumes that the actual work is distributed linearly: if we ask for work covering 10,000 rows (going by `vid`), we are fine with getting fewer rows, maybe even just a handful, but this needs to be uniform, i.e., any range of 10,000 `vid`s needs to contain roughly the same number of rows. Pruning breaks this assumption, since in a pruned subgraph the beginning of the subgraph (as determined by block numbers) is much sparser than the later parts. In one case, this misled the estimation logic into eventually trying to copy a range of 160M `vid`s: because the subgraph was pruned, a range of 160M `vid`s at the beginning of the subgraph contained only 128 rows, so the early part of the subgraph suggested that such a range could be copied in three minutes. After that, the subgraph was dense, and copying 160M `vid`s would take many hours.

This PR removes the assumption that the relation between `vid` and actual rows is linear. It uses the `histogram_bounds` from `pg_stats` to build a piecewise linear function, and estimates the number of rows in a given `vid` range using that function (the `Ogive` in the code). Now, when we ask for a batch of 10,000 rows, the code adapts to an uneven `vid` distribution and returns `vid` ranges of different sizes for different parts of the table.
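
To illustrate the idea, here is a minimal sketch of such a piecewise linear estimate. It is not the code from this PR: the method names (`from_bounds`, `rows_up_to`, `vid_at`, `next_batch_end`) and the `total_rows` parameter are hypothetical, and the sketch assumes that the histogram bounds split the table into buckets holding roughly equal numbers of rows, which is how PostgreSQL builds `histogram_bounds`.

```rust
/// Illustrative sketch only; names and signatures are assumptions, not the PR's API.
/// Maps a `vid` to the estimated number of rows with a smaller or equal `vid`.
pub struct Ogive {
    /// (vid, estimated cumulative rows), strictly increasing in both coordinates
    points: Vec<(f64, f64)>,
}

/// Linear interpolation over points sorted by x; clamps outside the covered range.
fn interpolate(points: &[(f64, f64)], x: f64) -> f64 {
    let (first, last) = (points[0], points[points.len() - 1]);
    if x <= first.0 {
        return first.1;
    }
    if x >= last.0 {
        return last.1;
    }
    // Index of the first point whose x-coordinate is greater than `x`
    let idx = points.partition_point(|p| p.0 <= x);
    let (x0, y0) = points[idx - 1];
    let (x1, y1) = points[idx];
    y0 + (x - x0) * (y1 - y0) / (x1 - x0)
}

impl Ogive {
    /// Build from histogram bounds, assuming every bucket between two
    /// consecutive bounds holds the same share of `total_rows`.
    pub fn from_bounds(bounds: &[i64], total_rows: f64) -> Option<Ogive> {
        let increasing = bounds.windows(2).all(|w| w[0] < w[1]);
        if bounds.len() < 2 || !increasing || total_rows <= 0.0 {
            return None;
        }
        let buckets = (bounds.len() - 1) as f64;
        let points = bounds
            .iter()
            .enumerate()
            .map(|(i, &b)| (b as f64, total_rows * i as f64 / buckets))
            .collect();
        Some(Ogive { points })
    }

    /// Estimated number of rows with a `vid` at or below the given one.
    pub fn rows_up_to(&self, vid: i64) -> f64 {
        interpolate(&self.points, vid as f64)
    }

    /// Inverse mapping: the `vid` at which the cumulative row estimate reaches `rows`.
    pub fn vid_at(&self, rows: f64) -> i64 {
        let inverse: Vec<(f64, f64)> = self.points.iter().map(|&(v, r)| (r, v)).collect();
        interpolate(&inverse, rows).round() as i64
    }

    /// End of a batch that starts at `start` and should cover about `target_rows` rows.
    pub fn next_batch_end(&self, start: i64, target_rows: f64) -> i64 {
        self.vid_at(self.rows_up_to(start) + target_rows)
    }
}
```

With bounds `[0, 1_000_000, 1_001_000, 1_002_000]` and 3,000 rows in total, a batch of 500 rows starting at `vid` 0 ends at `vid` 500,000 in this sketch, while a batch of the same size starting at `vid` 1,000,000 ends at 1,000,500. The sparse early part of the table gets a wide `vid` range and the dense later part a narrow one, which is the behavior described above.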